Performance of a Multi-class Biomedical Tagger on Clinical Records

Authors

  • S. V. Ramanan
  • Shereen Broido
  • P. Senthil Nathan
Abstract

We tested Cocoa, an existing dictionary- and rule-based entity tagger that tags multiple semantic types in the biomedical domain, including diseases, on disease/sign/symptom detection in clinical records in the ShARe/CLEF eHealth task. Initial analysis showed that precision was high (≥ 90%) but recall was low (≈ 50%), owing to (a) phrases peculiar to clinical notes, (b) ambiguous common words that require disambiguation, and (c) the large number of undefined acronyms. We extended the system to handle these cases by reference to the local intrasentential context, as derived from the training set. A small module was also added for event-based detection of annotated sentence fragments containing verbs/gerunds; an example is ‘LV systolic function appears depressed’. The event detection system had about 30 rules. With these modifications, the f-score was 0.75 on the test set. In a second run, we added about 70 frequently occurring acronyms as well as 15 phrases that were all in caps. The final results on the test set (f = 0.78) show that a multi-class tagger can work reasonably well on clinical records.
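The interplay between dictionary coverage, recall, and f-score described in the abstract can be illustrated with a toy example. The sketch below is not Cocoa's implementation. The functions `tag` and `prf`, the sample sentence, and both lexicons are hypothetical and chosen only to show how adding a missing acronym to a dictionary tagger raises recall without hurting precision.

```python
import re

def tag(text, lexicon):
    """Return the lexicon terms found in text as whole words (case-insensitive)."""
    found = set()
    for term in lexicon:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(term.lower())
    return found

def prf(predicted, gold):
    """Compute precision, recall, and the balanced f-score (F1) for two sets."""
    tp = len(predicted & gold)  # true positives: mentions in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical clinical sentence and gold annotations (illustration only).
text = "Pt with CHF and pneumonia, reports chest pain."
gold = {"chf", "pneumonia", "chest pain"}

# A base lexicon without the acronym, then one extended with it.
base_lexicon = {"pneumonia", "chest pain"}
extended_lexicon = base_lexicon | {"CHF"}

base_scores = prf(tag(text, base_lexicon), gold)
extended_scores = prf(tag(text, extended_lexicon), gold)
```

In this toy case the base lexicon misses the acronym, so precision stays perfect while recall drops. A real system would also need context rules to disambiguate acronyms, which this sketch does not attempt.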

Similar resources

Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...

A Comparative Study of the Effect of Part-of-Speech Tagging on Parsing in Automatic Processing of the Persian Language

In this paper, the role of Part-of-Speech (POS) tagging for parsing in automatic processing of the Persian language is studied. To this end, the impact of the quality of POS tagging as well as the impact of the quantity of information available in the POS tags on parsing are studied. To reach the goals, three parsing scenarios are proposed and compared. In the first scenario, the parser assigns...

Evaluation of the Effect of Presence of Health Information Technology Expert on Medical Records of Patients Admitted to Fatemeh Zahra Hospital, Sari, Iran

Background: Documenting medical records plays an important role in treatment and prevention. The purpose of this study was to evaluate the impact of the presence of health information technology experts in clinical wards on the documentation of hospital admissions files. Methods: In this descriptive cross-sectional study, 96 inpatient records in 2014 and 96 inpatient records in Fatemeh Zahra H...

One Tagger, Many Uses: Illustrating the Power of Ontologies in Dictionary-based Named Entity Recognition

Automatic annotation of text is an important complement to manual annotation, because the latter is highly labour intensive. We have developed a fast dictionary-based named entity recognition (NER) system and addressed a wide variety of biomedical problems by applying it to text from many different sources. We have used this tagger both in real-time tools to support curation efforts and in pipel...

Parts-of-Speech Tagger Errors Do Not Necessarily Degrade Accuracy in Extracting Information from Biomedical Text

Background: An ongoing assessment of the literature is difficult with the rapidly increasing volume of research publications and limited effective information extraction tools which identify entity relationships from text. A recent study reported development of Muscorian, a generic text processing tool for extracting protein-protein interactions from text that achieved comparable performance to ...

Publication date: 2013